AITopics | normalization function

7988e9b3876ad689e921ce05d711442f-Paper-Conference.pdf

Neural Information Processing Systems | Feb-15-2026, 04:13:41 GMT

artificial intelligence, ground-truth label, machine learning, (15 more...)

Neural Information Processing Systems

Country:

North America > Canada > Ontario > Toronto (0.14)
Asia > China > Chongqing Province > Chongqing (0.04)
Asia > China > Beijing > Beijing (0.04)
(2 more...)

Genre: Research Report (0.68)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.46)

60243f9b1ac2dba11ff8131c8f4431e0-Paper.pdf

Neural Information Processing Systems | Feb-8-2026, 23:16:35 GMT

neural information processing system, normalization function, probability distribution, (13 more...)

Neural Information Processing Systems

Country:

North America > United States > California > Santa Clara County > Palo Alto (0.04)
Asia > Middle East > Jordan (0.04)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.41)

Empirical Risk Minimization with $f$-Divergence Regularization

Daunas, Francisco, Esnaola, Iñaki, Perlaza, Samir M., Poor, H. Vincent

arXiv.org Machine Learning | Jan-21-2026

In this paper, the solution to the empirical risk minimization problem with $f$-divergence regularization (ERM-$f$DR) is presented and conditions under which the solution also serves as the solution to the minimization of the expected empirical risk subject to an $f$-divergence constraint are established. The proposed approach extends applicability to a broader class of $f$-divergences than previously reported and yields theoretical results that recover previously known results. Additionally, the difference between the expected empirical risk of the ERM-$f$DR solution and that of its reference measure is characterized, providing insights into previously studied cases of $f$-divergences. A central contribution is the introduction of the normalization function, a mathematical object that is critical in both the dual formulation and practical computation of the ERM-$f$DR solution. This work presents an implicit characterization of the normalization function as a nonlinear ordinary differential equation (ODE), establishes its key properties, and subsequently leverages them to construct a numerical algorithm for approximating the normalization factor under mild assumptions. Further analysis demonstrates structural equivalences between ERM-$f$DR problems with different $f$-divergences via transformations of the empirical risk. Finally, the proposed algorithm is used to compute the training and test risks of ERM-$f$DR solutions under different $f$-divergence regularizers. This numerical example highlights the practical implications of choosing different functions $f$ in ERM-$f$DR problems.
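To make the role of the normalization function concrete, here is a minimal Python sketch on a finite model set. It assumes the ERM-$f$DR solution has density $(f')^{-1}(-(\beta + L(\theta))/\lambda)$ with respect to the reference measure, where $\beta$ is the value that makes the total mass equal to one. It finds $\beta$ by plain bisection, which is a stand-in and not the ODE-based algorithm the abstract describes. All names (`normalization_constant`, `erm_fdr_solution`, `kl_inverse_derivative`), the bracket `[-50, 50]`, and the toy risks are hypothetical.

```python
import math


def kl_inverse_derivative(y):
    # For f(x) = x*log(x): f'(x) = log(x) + 1, hence (f')^{-1}(y) = exp(y - 1).
    return math.exp(y - 1.0)


def total_mass(beta, risks, q, lam, inv_df):
    # Mass of the candidate measure with density inv_df(-(beta + L_i)/lam) w.r.t. q.
    return sum(qi * inv_df(-(beta + li) / lam) for qi, li in zip(q, risks))


def normalization_constant(risks, q, lam, inv_df, lo=-50.0, hi=50.0, tol=1e-12):
    # Bisection on beta. Assumes inv_df is increasing, so the total mass is
    # decreasing in beta, and that [lo, hi] brackets the root (an assumption).
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if total_mass(mid, risks, q, lam, inv_df) > 1.0:
            lo = mid  # too much mass: beta must grow
        else:
            hi = mid
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)


def erm_fdr_solution(risks, q, lam, inv_df):
    # Returns (beta, p), where p is the regularized solution on the finite set.
    beta = normalization_constant(risks, q, lam, inv_df)
    p = [qi * inv_df(-(beta + li) / lam) for qi, li in zip(q, risks)]
    return beta, p


if __name__ == "__main__":
    risks = [0.2, 1.0, 0.5]      # toy empirical risks of three models
    q = [1.0 / 3.0] * 3          # uniform reference measure
    beta, p = erm_fdr_solution(risks, q, lam=0.5, inv_df=kl_inverse_derivative)
    print(beta, p)
```

With the KL generator $f(x) = x \log x$, the result should coincide with the familiar Gibbs measure proportional to $q_i \exp(-L_i/\lambda)$, which gives a quick sanity check. Swapping in a different `inv_df` changes the regularizer without touching the root-finding step.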

artificial intelligence, bayesian inference, machine learning, (17 more...)

arXiv.org Machine Learning

2601.13191

Country:

Asia (0.92)
Europe > United Kingdom > England (0.28)
North America > United States > California > Los Angeles County (0.27)

Genre: Research Report (0.81)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Vision (0.92)
Information Technology > Data Science (0.92)
(2 more...)

Tensor-Parallelism with Partially Synchronized Activations

Lamprecht, Itay, Karnieli, Asaf, Hanani, Yair, Giladi, Niv, Soudry, Daniel

arXiv.org Artificial Intelligence | Dec-2-2025

Training and inference of Large Language Models (LLMs) with tensor-parallelism requires substantial communication to synchronize activations. Our findings suggest that with a few minor adjustments to current practices, LLMs can be trained without fully synchronizing activations, reducing bandwidth demands. We name this "Communication-Aware Architecture for Tensor-parallelism" (CAAT-Net). We train a 7B parameter CAAT-Net model and show that tensor-parallel communication can be reduced by up to 50% with no significant drop in pretraining accuracy across nearly all evaluated benchmarks. We also experiment with smaller 130M and 1.1B models to show the robustness and scalability of our method. We find that, in some scenarios, validation loss can even improve when reducing communication. Finally, we demonstrate how CAAT-Net accelerates both training and inference workloads across various settings and model sizes.

large language model, machine learning, natural language, (18 more...)

arXiv.org Artificial Intelligence

2506.19645

Country: Asia > Middle East > Israel (0.28)

Genre: Research Report > New Finding (0.86)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

ALIM: Adjusting Label Importance Mechanism for Noisy Partial Label Learning

Neural Information Processing Systems | Oct-8-2025, 22:53:21 GMT

Noisy partial label learning (noisy PLL) is an important branch of weakly supervised learning.

artificial intelligence, ground-truth label, machine learning, (15 more...)

Neural Information Processing Systems

Country:

North America > Canada > Ontario > Toronto (0.14)
Asia > China > Chongqing Province > Chongqing (0.04)
Asia > China > Beijing > Beijing (0.04)
(2 more...)

Genre: Research Report (0.68)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.46)

60243f9b1ac2dba11ff8131c8f4431e0-Paper.pdf

Neural Information Processing Systems | Aug-14-2025, 19:16:15 GMT

neural information processing system, normalization function, probability distribution, (13 more...)

Neural Information Processing Systems

Country:

North America > United States > California > Santa Clara County > Palo Alto (0.04)
Asia > Middle East > Jordan (0.04)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

A Dual Optimization View to Empirical Risk Minimization with f-Divergence Regularization

Daunas, Francisco, Esnaola, Iñaki, Perlaza, Samir M.

arXiv.org Machine Learning | Aug-6-2025

The dual formulation of empirical risk minimization with f-divergence regularization (ERM-fDR) is introduced. The solution of the dual optimization problem to the ERM-fDR is connected to the notion of normalization function, which is introduced as an implicit function. This dual approach leverages the Legendre-Fenchel transform and the implicit function theorem to provide a nonlinear ODE expression for the normalization function. Furthermore, the nonlinear ODE expression and its properties provide a computationally efficient method to calculate the normalization function of the ERM-fDR solution under a mild condition. Empirical risk minimization (ERM) [1]-[6] is often posed as an optimization problem regularized by a statistical distance between the probability measure to be optimized and a given reference measure [7]-[13].
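For reference, the Legendre-Fenchel transform named in this abstract is the standard convex conjugate. The block below states the textbook definition and works it out for f(x) = x log x, the generator of the KL divergence. That choice of f is an illustrative assumption, not an example taken from the paper.

```latex
% Definition of the convex conjugate (Legendre-Fenchel transform) of f
f^{*}(t) = \sup_{x > 0} \bigl( t x - f(x) \bigr)

% Example with f(x) = x \log x: the stationarity condition t = \log x + 1
% gives the maximizer x = e^{t-1}, and substituting back yields
f^{*}(t) = t\, e^{t-1} - e^{t-1} (t - 1) = e^{t-1}
```

The maximizer e^{t-1} is also the inverse of the derivative of f evaluated at t, which is why inverse derivatives of f tend to appear in dual solutions of this kind.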

artificial intelligence, machine learning, normalization function, (15 more...)

arXiv.org Machine Learning

2508.03314

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
Europe > France > Provence-Alpes-Côte d'Azur (0.04)
Oceania > French Polynesia (0.04)
(9 more...)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.68)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.56)

Optimal normalization in quantum-classical hybrid models for anti-cancer drug response prediction

Ito, Takafumi, Artem, Lysenko, Tsunoda, Tatsuhiko

arXiv.org Artificial Intelligence | May-16-2025

Quantum-classical Hybrid Machine Learning (QHML) models are recognized for their robust performance and high generalization ability even for relatively small datasets. These qualities offer unique advantages for anti-cancer drug response prediction, where the number of available samples is typically small. However, such hybrid models appear to be very sensitive to the data encoding used at the interface of a neural network and a quantum circuit, with suboptimal choices leading to stability issues. To address this problem, we propose a novel strategy that uses a normalization function based on a moderated gradient version of the $\tanh$. This method transforms the outputs of the neural networks without concentrating them at the extreme value ranges. Our idea was evaluated on a dataset of gene expression and drug response measurements for various cancer cell lines, where we compared the prediction performance of a classical deep learning model and several QHML models. These results confirmed that QHML performed better than the classical models when data was optimally normalized. This study opens up new possibilities for biomedical data analysis using quantum computers.

artificial intelligence, machine learning, normalization, (16 more...)

arXiv.org Artificial Intelligence

2505.10037

Country: Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.16)

Genre: Research Report (1.00)

Industry: